Abstract

Nested regular expressions (NREs) have been proposed as a powerful formalism for querying RDFS graphs, but research in a more general graph database context has been scarce, and static analysis results are currently lacking. In this paper we investigate the problem of containment of NREs, and show that it can be solved in PSPACE, i.e., the same complexity as the problem of containment of regular expressions or regular path queries (RPQs).
1 Introduction

Graph-structured data has become pervasive in data-centric applications. Social networks, bioinformatics, astronomic databases, digital libraries, Semantic Web, and linked government data are only a few examples of applications in which structuring data as graphs is, simply, essential.

Traditional relational query languages do not appropriately cope with the querying problems raised by graph-structured data. The reason for this is twofold. First, in the context of graph databases one is typically interested in navigational queries, i.e., queries that traverse the edges of the graph while checking for the existence of paths satisfying certain conditions. However, most relational query languages, such as SQL, are not designed to deal with this kind of recursive query [1]. Second, current graph database applications tend to be massive in size (think, for instance, of social networks or astronomic databases, which may store terabytes of information). Thus, one can immediately dismiss any query language that cannot be evaluated in polynomial time (or even in linear time!). But then even the core of the usual relational query languages, conjunctive queries (CQs), does not satisfy this property. In fact, parameterized complexity analysis tells us that, under widely-held complexity theoretical assumptions, CQs over graph databases cannot be evaluated in time $O(|G|^c \cdot f(|\varphi|))$, where $c \geq 1$ is a constant and $f : \mathbb{N} \to \mathbb{N}$ is a computable function [14].

This raises a need for languages that are specific to the graph database context. The most commonly used core of these languages is the class of so-called regular path queries, or RPQs [7], which specify the existence of paths between nodes, with the restriction that the label of such a path belongs to a regular language. The language of RPQs was later extended with the ability to traverse edges backwards, providing them with a 2-way functionality. This gives rise to the notion of 2RPQs [5].

Nested regular expressions are a graph database language that aims to extend the possibility of using regular expressions, or 2-way regular expressions, for querying graphs with an existential test operator $[\cdot]$, also known as the nesting operator, similar to the one in XPath [TU]. This class of expressions was proposed in [15] for querying Semantic Web data, and has received a fair deal of attention in recent years [11, 2, 3]. Here we study the problem of containment of NREs, which is the following problem:



Problem: NREContainment
Input: NREs $Q_1$ and $Q_2$ over $\Sigma$.
Question: Is $Q_1 \subseteq Q_2$?



Note that we study this problem for the restricted case when all the possible input graphs are semipaths. 
The general case will be shown in an extended version of the manuscript. 
Figure 1: A fragment of the RDF Linked Data representation of DBLP [5] available at http://dblp.l3s.de/d2r/

2 Preliminaries 

2.1 Graph Database and queries 

Graph databases. Let $\mathcal{V}$ be a countably infinite set of node ids, and $\Sigma$ a finite alphabet. A graph database $G$ over $\Sigma$ is a pair $(V, E)$, where $V$ is a finite set of node ids (that is, $V$ is a finite subset of $\mathcal{V}$) and $E \subseteq V \times \Sigma \times V$. That is, $G$ is an edge-labeled directed graph, where the fact that $(u, a, v)$ belongs to $E$ means that there is an edge from node $u$ into node $v$ labeled $a$. For a graph database $G = (V, E)$, we write $(u, a, v) \in G$ whenever $(u, a, v) \in E$.

Nested Regular Expressions. 

The language of nested regular expressions (NREs) was first proposed in [15] for querying Semantic Web data. Next we formalize the language of nested regular expressions in the context of graph databases.

Let $\Sigma$ be a finite alphabet. The NREs over $\Sigma$ extend classical regular expressions with an existential nesting test operator $[\cdot]$ (or just nesting operator, for short), and an inverse operator $a^-$, over each $a \in \Sigma$. The syntax of NREs is given by the following grammar:

  $R \;:=\; \varepsilon \mid a \ (a \in \Sigma) \mid a^- \ (a \in \Sigma) \mid R \cdot R \mid R^* \mid R + R \mid [R]$

As is customary, we use $n^+$ as a shortcut for $n \cdot n^*$.

Intuitively, NREs specify pairs of node ids in a graph database, subject to the existence of a path satisfying a certain regular condition among them. That is, each NRE $R$ defines a binary relation $[\![R]\!]_G$ when evaluated over a graph database $G$. This binary relation is defined inductively as follows, where we assume that $a$ is a symbol in $\Sigma$, and $n$, $n_1$ and $n_2$ are arbitrary NREs:

  $[\![\varepsilon]\!]_G = \{(u, u) \mid u \text{ is a node id in } G\}$
  $[\![a]\!]_G = \{(u, v) \mid (u, a, v) \in G\}$
  $[\![a^-]\!]_G = \{(u, v) \mid (v, a, u) \in G\}$
  $[\![n_1 \cdot n_2]\!]_G = [\![n_1]\!]_G \circ [\![n_2]\!]_G$
  $[\![n_1 + n_2]\!]_G = [\![n_1]\!]_G \cup [\![n_2]\!]_G$
  $[\![n^*]\!]_G = [\![\varepsilon]\!]_G \cup [\![n]\!]_G \cup [\![n \cdot n]\!]_G \cup [\![n \cdot n \cdot n]\!]_G \cup \cdots$
  $[\![[n]]\!]_G = \{(u, u) \mid \text{there exists } v \text{ s.t. } (u, v) \in [\![n]\!]_G\}$

Here, the symbol $\circ$ denotes the usual composition of binary relations, that is, $[\![n_1]\!]_G \circ [\![n_2]\!]_G = \{(u, v) \mid \text{there exists } w \text{ s.t. } (u, w) \in [\![n_1]\!]_G \text{ and } (w, v) \in [\![n_2]\!]_G\}$.
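The inductive semantics can be read almost directly as a recursive procedure. The following is a minimal sketch, not the linear-time algorithm of [15]: it assumes a hypothetical encoding of NREs as nested tuples (`("eps",)`, `("sym", a)`, `("inv", a)`, `("cat", r1, r2)`, `("alt", r1, r2)`, `("star", r)`, `("nest", r)`) and of a graph as a set of nodes plus a set of triples `(u, a, v)`. The names `compose` and `evaluate` are chosen here for illustration only.

```python
def compose(r, s):
    """Composition of binary relations: (u, v) such that (u, w) in r and (w, v) in s."""
    by_source = {}
    for w, v in s:
        by_source.setdefault(w, set()).add(v)
    return {(u, v) for u, w in r for v in by_source.get(w, ())}


def evaluate(nre, nodes, edges):
    """Return the binary relation defined by `nre` over the graph (nodes, edges)."""
    kind = nre[0]
    identity = {(u, u) for u in nodes}
    if kind == "eps":
        return identity
    if kind == "sym":   # forward edge labeled a
        return {(u, v) for (u, a, v) in edges if a == nre[1]}
    if kind == "inv":   # edge labeled a traversed backwards
        return {(v, u) for (u, a, v) in edges if a == nre[1]}
    if kind == "cat":
        return compose(evaluate(nre[1], nodes, edges), evaluate(nre[2], nodes, edges))
    if kind == "alt":
        return evaluate(nre[1], nodes, edges) | evaluate(nre[2], nodes, edges)
    if kind == "star":  # least fixpoint: identity, n, n.n, n.n.n, ...
        base = evaluate(nre[1], nodes, edges)
        result = identity
        while True:
            bigger = result | compose(result, base)
            if bigger == result:
                return result
            result = bigger
    if kind == "nest":  # existential test: keep u if some v is reachable via n
        return {(u, u) for (u, _v) in evaluate(nre[1], nodes, edges)}
    raise ValueError(f"unknown NRE constructor: {kind}")
```

The Kleene star is computed as a fixpoint because the infinite union in its definition stabilizes after finitely many steps on a finite graph.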



Example 2.1 Let $G$ be the graph database in Figure 1. The following is a simple NRE that matches all pairs $(x, y)$ such that $x$ is an author that published a paper in conference $y$:

  $n_1 = \textit{creator}^- \cdot \textit{partOf} \cdot \textit{series}$

For example, the pairs (:Jeffrey_D._Ullman, conf:focs) and (:Ronald_Fagin, conf:pods) are in $[\![n_1]\!]_G$. Consider now the following expression that matches pairs $(x, y)$ such that $x$ and $y$ are connected by a coauthorship sequence:

  $n_2 = (\textit{creator}^- \cdot \textit{creator})^+$

For example, the pair (:John_E._Hopkroft, :Pierre_Wolper) is in $[\![n_2]\!]_G$. Finally, the following expression matches all pairs $(x, y)$ such that $x$ and $y$ are connected by a coauthorship sequence that only considers conference papers:

  $n_3 = (\textit{creator}^- \cdot [\textit{partOf} \cdot \textit{series}] \cdot \textit{creator})^+$

Let us give the intuition of the evaluation of this expression. Assume that we start at node $u$. The (inverse) edge $\textit{creator}^-$ makes us navigate from $u$ to a paper $v$ created by $u$. Then the existential test $[\textit{partOf} \cdot \textit{series}]$ is used to check that from $v$ we can navigate to a conference (and thus, $v$ is a conference paper). Finally, we follow edge $\textit{creator}$ from $v$ to an author $w$ of $v$. The $(\cdot)^+$ over the expression allows us to repeat this sequence several times. For instance, (:John_E._Hopkroft, :Moshe_Y._Vardi) is in $[\![n_3]\!]_G$, but (:John_E._Hopkroft, :Pierre_Wolper) is not in $[\![n_3]\!]_G$.

Complexity and expressiveness of NREs. The following result, proved in [15], shows a remarkable property of NREs. It states that the query evaluation problem for NREs is not only polynomial in combined complexity (i.e., when both the database and the query are given as input), but also that it can be solved linearly in both the size of the database and the expression. Given a graph database $G$ and an NRE $R$, we use $|G|$ to denote the size of $G$ (in terms of the number of edges $(u, a, v) \in G$), and $|R|$ to denote the size of $R$.

Proposition 2.2 (from [15]) Checking, given a graph database $G$, a pair of nodes $(u, v)$, and an NRE $R$, whether $(u, v) \in [\![R]\!]_G$, can be done in time $O(|G| \cdot |R|)$.

On the expressiveness side, NREs subsume several important query languages for graph databases. For instance, by disallowing the inverse operator $a^-$ and the nesting operator $[\cdot]$ we obtain the class of regular path queries (RPQs) [3 US], while by only disallowing the nesting operator $[\cdot]$ we obtain the class of RPQs with inverse, or 2RPQs [5]. (In particular, both expressions $n_1$ and $n_2$ in Example 2.1 are 2RPQs.) In turn, NREs allow for an important increase in expressive power over those languages. For example, it can be shown that the NRE $n_3$ in Example 2.1 cannot be expressed without the nesting operator $[\cdot]$, and hence it is not expressible in the language of 2RPQs (cf. [U]).

On the other hand, the class of NREs fails to capture more expressive languages for graph-structured
data that combine navigational properties with quantification over node ids. Some of the most paradigmatic 
examples of such languages are the classes of conjunctive RPQs and 2RPQs, that close RPQs and 2RPQs, 
respectively, under conjunctions and existential quantification. Both classes of queries have been studied in 
depth, as they allow identifying complex patterns over graph-structured data [SI [51 H] . 

3 Containment of NREs over paths 

3.1 Problem Definition 

Let us begin with some notation. 

Throughout the proof we assume that $\Sigma$ includes all reverse symbols. More precisely, if $\Sigma'$ is an alphabet, we work instead with the alphabet $\Sigma = \Sigma' \cup \{a^- \mid a \in \Sigma'\}$. Let $G = (V, E)$ be a graph over $\Sigma$. A semipath in $G$ is a sequence $u_1, a_1, u_2, a_2, \ldots, u_m, a_m, u_{m+1}$, where each $u_i$ belongs to $V$, each $a_i$ belongs to $\Sigma$, and for each $u_i, a_i, u_{i+1}$ we have that $(u_i, a_i, u_{i+1})$ belongs to $E$ if $a_i$ is not a reverse symbol, and $(u_{i+1}, a, u_i)$ belongs to $E$ if $a_i$ is a reverse symbol, i.e., of the form $a^-$ for some $a \in \Sigma'$. A semipath is simple if all of its nodes are distinct. Finally, a graph $G$ resembles a (simple) semipath if there is a (simple) semipath $\pi$ in $G$ of the form above such that the nodes of $G$ are precisely $\{u_1, \ldots, u_{m+1}\}$ and the edges of $G$ are precisely those that witness the above definition.
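To make the definition concrete, the following sketch builds the graph that resembles the simple semipath with a given label. It assumes, for illustration only, that a reverse symbol $a^-$ is written as the string `a` followed by `"-"`, and that the nodes $u_1, \ldots, u_{m+1}$ are the integers `1..m+1`; the function name `semipath_graph` is hypothetical.

```python
def semipath_graph(word):
    """Build the graph resembling the simple semipath u_1, a_1, ..., a_m, u_{m+1}.

    `word` is the sequence of labels a_1 ... a_m. A label ending in "-" is
    treated as a reverse symbol (an encoding assumed for this sketch).
    """
    nodes = set(range(1, len(word) + 2))
    edges = set()
    for i, label in enumerate(word, start=1):
        if label.endswith("-"):
            # Reverse symbol a^-: the witnessing edge goes from u_{i+1} to u_i.
            edges.add((i + 1, label[:-1], i))
        else:
            # Plain symbol a: the witnessing edge goes from u_i to u_{i+1}.
            edges.add((i, label, i + 1))
    return nodes, edges
```

For instance, the label `["a", "b-"]` yields three nodes, an `a`-edge from node 1 to node 2, and a `b`-edge from node 3 to node 2.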

As we have mentioned, we study NREContainment only when the input graphs are semipaths. We are now ready to describe our goal, which is to show that the following problem is in PSPACE: Given NREs $Q_1$ and $Q_2$ over $\Sigma$, decide whether $[\![Q_1]\!]_G \subseteq [\![Q_2]\!]_G$ for all graphs $G$ over $\Sigma$ such that $G$ resembles a simple semipath. In what follows, we refer to this problem as SP-NREContainment.

3.2 Alternating 2-way finite automata

Following [12], an alternating 2-way finite automaton, or A2FA for short, is a tuple $A = (Q, q_0, U, F, \Sigma, \delta)$, where $Q$ is the set of states, $U \subseteq Q$ is a set of universal states, $q_0$ is the initial state, $F \subseteq Q$ is the set of final states, $\Sigma$ is the input alphabet (we also use symbols $\%$ and $\$$ not in $\Sigma$ as the start and end markers of the string), and the transition function is $\delta : Q \times (\Sigma \cup \{\%, \$\}) \to 2^{Q \times \{-1, 0, 1\}}$.

The numbers $-1, 0, 1$ in the transition stand for moving back, staying and moving forward, respectively. The input is delimited with $\%$ at the beginning and $\$$ at the end. For convenience, we assume that the automaton starts in state $q_0$ while reading the symbol $\$$ of the string. Finally, we impose the following restriction on $\delta$: the machine can only branch into an existential or universal state while the automaton is not moving backwards or forwards. That is, if for some states $p, q$ and symbol $a \in \Sigma$ we have that $(p, -1)$ or $(p, 1)$ belongs to $\delta(q, a)$, then it must be the case that $\delta(q, a) = \{(p, -1)\}$ or $\delta(q, a) = \{(p, 1)\}$. Obviously we do not lose expressive power by imposing this restriction, and we can still simulate any nondeterministic two-way automaton with A2FAs.

Semantics. Semantics are given in terms of computation trees over instantaneous descriptions. An instantaneous description (ID) is a triple of the form $(q, w, i)$, where $q$ is a state, $w$ is a word in $\%\Sigma^*(\varepsilon \mid \$)$ and $1 \leq i \leq |w| + 1$. Intuitively, it represents the state of the current computation, the string it has already read, and the current position of the automaton. An ID is universal if $q \in U$ and existential otherwise, and accepting IDs are of the form $(q, w, |w| + 1)$ for $w \in \%\Sigma^*\$$ and $q \in F$.

Let $w = a_1, \ldots, a_n$, for each $a_i \in \Sigma \cup \{\%, \$\}$. The transition relation $\Rightarrow$ is defined as follows:

• $(q, w, i) \Rightarrow (p, w, i)$, if $(p, 0) \in \delta(q, a_i)$ and $1 \leq i \leq n$;

• $(q, w, i) \Rightarrow (p, w, i + 1)$, if $(p, 1) \in \delta(q, a_i)$ and $1 \leq i \leq n$; and

• $(q, w, i) \Rightarrow (p, w, i - 1)$, if $(p, -1) \in \delta(q, a_i)$ and $1 < i \leq n$.

A computation tree $\Pi$ of an A2FA $A = (Q, q_0, U, F, \Sigma, \delta)$ is a finite, nonempty tree with each of its nodes $\pi$ labelled with an ID $l(\pi)$, and such that

1. If $\pi$ is a non-leaf node and $l(\pi)$ is universal, let $I_1, \ldots, I_k$ be all IDs such that $l(\pi) \Rightarrow I_j$ for each $1 \leq j \leq k$. Then $\pi$ has exactly $k$ children $\pi_1, \ldots, \pi_k$, where $l(\pi_j) = I_j$; and

2. If $\pi$ is a non-leaf node and $l(\pi)$ is existential, then $\pi$ has exactly one child $\pi'$ such that $l(\pi) \Rightarrow l(\pi')$.

Finally, an accepting computation tree of $A$ over $w$ is a computation tree $\Pi$ whose root is labelled with $(q_0, \%w\$, |w| + 2)$ and each of whose leaves is labelled with an accepting ID.
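Since computation trees are finite, the existence of an accepting tree for a fixed input can be decided by a least-fixpoint computation over pairs (state, position). The sketch below illustrates this reading of the definitions; it is not the emptiness procedure of [12]. The encoding is an assumption made for illustration: `delta` is a dictionary from `(state, symbol)` to a set of `(state, move)` pairs, and the function name `a2fa_accepts` is hypothetical.

```python
def a2fa_accepts(delta, universal, final, q0, word):
    """Decide whether an accepting computation tree exists for %word$.

    delta: dict mapping (state, symbol) -> set of (state, move), move in {-1, 0, 1}.
    universal, final: sets of states; q0: initial state.
    """
    tape = ["%"] + list(word) + ["$"]
    n = len(tape)  # n = |word| + 2; positions are 1-based

    def successors(q, i):
        # IDs reachable in one step from (q, tape, i); only positions 1..n are read.
        if not 1 <= i <= n:
            return set()
        return {(p, i + d) for (p, d) in delta.get((q, tape[i - 1]), ())
                if 1 <= i + d <= n + 1}

    states = ({q0} | set(final) | {q for (q, _sym) in delta}
              | {p for targets in delta.values() for (p, _d) in targets})

    # Accepting IDs: a final state positioned just past the end marker.
    good = {(q, n + 1) for q in final}
    changed = True
    while changed:
        changed = False
        for q in states:
            for i in range(1, n + 1):
                if (q, i) in good:
                    continue
                succ = successors(q, i)
                if q in universal:
                    ok = bool(succ) and succ <= good   # all children must succeed
                else:
                    ok = bool(succ & good)             # one child suffices
                if ok:
                    good.add((q, i))
                    changed = True
    # The run starts in q0 while reading $, i.e. at position |word| + 2.
    return (q0, n) in good
```

The number of (state, position) pairs is polynomial in the input, so this check runs in polynomial time for a fixed word; the difficulty of the emptiness problem lies in quantifying over all words.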

We need the following proposition. It follows immediately from the results in [12]:

Proposition 3.1 Given an A2FA $A$, it is PSPACE-complete to decide whether the language of $A$ is empty.



3.3 Proof of SP-NRECONTAINMENT 

The idea is to code acceptance of strings by NREs using alternating 2-way automata. More precisely, given an NRE $R$, we construct an A2FA $A_R$ such that the language of $A_R$ corresponds, in a precise sense, to all those words $w$ such that $[\![R]\!]_G$ is nonempty for all those graphs $G$ that resemble the simple semipath $w$.

Construction of $A_R$. We define the translation by induction; all states are existential unless otherwise noted. Along the construction, we shall be marking, in each step, a particular state of the automaton. We use this mark in the construction. Furthermore, for the sake of readability we shall drop the assumption that these automata are deterministic when moving the head forward or backwards.

• If $R = a$, then $A_R = (\Sigma, \{q_0, q_f, q_r\}, \emptyset, q_0, \delta, q_f)$, with $\delta$ defined as:

  $\delta(q_0, a) = \{(q_f, 1), (q_r, -1)\}$
  $\delta(q_0, b) = \{(q_r, -1)\}$, for each $b \in \Sigma$, $b \neq a$
  $\delta(q_r, a^-) = \{(q_f, 0)\}$

State $q_r$ and the two-way functionality are added so that the automaton correctly accepts when the input is a word of the form $\Sigma^* a^- \Sigma^*$ (see [5] for a thorough explanation of this machinery). Moreover, state $q_f$ is marked.

• Similarly, if $R = a^-$, then $A_R = (\Sigma, \{q_0, q_f, q_r\}, \emptyset, q_0, \delta, q_f)$, with $\delta$ defined as:

  $\delta(q_0, a^-) = \{(q_f, 1), (q_r, -1)\}$
  $\delta(q_0, b) = \{(q_r, -1)\}$, for each $b \in \Sigma$, $b \neq a^-$
  $\delta(q_r, a) = \{(q_f, 0)\}$

State $q_f$ is marked.

• Case when $R = R_1 + R_2$. Let $A_{R_i} = (\Sigma, Q^i, U^i, q_0^i, \delta^i, F^i)$, for $i = 1, 2$, and assume that $q_m^i$ is the marked state of $A_{R_i}$. Define $A_R = (\Sigma, Q, U, q_0, \delta, F)$, where $Q = \{q_0, q_f\} \cup Q^1 \cup Q^2$, $U = U^1 \cup U^2$, $F = \{q_f\} \cup (F^1 \setminus \{q_m^1\}) \cup (F^2 \setminus \{q_m^2\})$ and $\delta = \delta^1 \cup \delta^2$, plus transitions

  $\delta(q_0, \varepsilon) = \{(q_0^1, 0), (q_0^2, 0)\}$
  $\delta(q_m^1, \varepsilon) = \{(q_f, 0)\}$
  $\delta(q_m^2, \varepsilon) = \{(q_f, 0)\}$

For each $i = 1, 2$, remove all marks from $A_{R_i}$, and mark state $q_f$.

• In the case that $R = R_1 \cdot R_2$, let $A_{R_i} = (\Sigma, Q^i, U^i, q_0^i, \delta^i, F^i)$, for $i = 1, 2$, and assume that $q_m^i$ is the marked state of $A_{R_i}$. For each $i = 1, 2$, remove all colorings from $A_{R_i}$, remove states $q_0^i$ and $q_f^i$ from $Q^i$ and remove from $\delta^i$ all transitions mentioning $q_0^i$ or $q_f^i$. Define $A_R = (\Sigma, Q, U, q_0, \delta, F)$, where $Q = \{q_0, q_f\} \cup Q^1 \cup Q^2$, $U = U^1 \cup U^2$, $F = \{q_f\} \cup (F^1 \setminus \{q_m^1\}) \cup (F^2 \setminus \{q_m^2\})$ and $\delta = \delta^1 \cup \delta^2$, plus transitions

  $\delta(q_0, \varepsilon) = \{(q_0^1, 0)\}$
  $\delta(q_m^1, \varepsilon) = \{(q_0^2, 0)\}$

For each $i = 1, 2$, remove all marks from $A_{R_i}$, and mark state $q_f$.

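• For $R = R_1^*$, let $A_{R_1} = (\Sigma, Q^1, U^1, q_0^1, \delta^1, F^1)$, and assume that $q_m^1$ is the marked state of $A_{R_1}$. Define $A_R = (\Sigma, Q, U^1, q_0, \delta, F)$, where $Q = \{q_0, q_f\} \cup Q^1$, $F = \{q_f\} \cup (F^1 \setminus \{q_m^1\})$ and $\delta = \delta^1$ plus transitions

  $\delta(q_0, \varepsilon) = \{(q_0^1, 0)\}$
  $\delta(q_m^1, \varepsilon) = \{(q_f, 0), (q_0^1, 0)\}$

Remove all marks from $A_{R_1}$, and mark state $q_f$.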

• When $R = [R_1]$, let $A_{R_1} = (\Sigma, Q^1, U^1, q_0^1, \delta^1, F^1)$, and assume that $q_m^1$ is the marked state of $A_{R_1}$. Then $A_R = (\Sigma, Q, U, q_0, \delta, F)$, where $Q = \{q_0, p, q_2, q_f\} \cup Q^1$, $U = U^1 \cup \{p\}$, $F = \{q_f\} \cup F^1$ and $\delta = \delta^1$, plus transitions

  $\delta(q_0, \varepsilon) = \{(p, 0)\}$
  $\delta(p, \varepsilon) = \{(q_f, 0), (q_0^1, 0)\}$ (recall that $p$ is a universal state)
  $\delta(q_m^1, a) = \{(q_m^1, 1)\}$ for each $a \in \Sigma$

Remove all marks from $A_{R_1}$, and mark state $q_f$.

Let $A_R = (Q, q_0, U, F, \Sigma, \delta)$ be as constructed by this algorithm. To finish our construction we need to allow $A_R$ to (nondeterministically) move backwards from the end of the word until it reaches a suitable starting point for the computation, and allow every final state to reach the end of the word in its computation. Formally, we define $A'_R = (Q \cup \{q'_0\}, q'_0, U, F, \Sigma \cup \{\$\}, \delta')$, where $\delta'$ contains all transitions in $\delta$ plus transitions $\delta'(q'_0, a) = \{(q_0, 0), (q'_0, -1)\}$ for each $a \in \Sigma \cup \{\$\}$ and $\delta'(q_f, a) = \{(q_f, 1)\}$ for each $a \in \Sigma$.

Notice that the above construction can be computed in polynomial time with respect to $R$. Furthermore, let $q_m$ be the marked state of $A'_R$. From its construction, it is clear that every accepting computation tree $\Pi$ of $A'_R$ on input $w$ will have the following form: (1) for some $1 \leq i \leq |w|$ there is a single path from the root to a node $\pi_s$ such that $l(\pi_s) = (q_0, w, i)$ and no ancestor of $\pi_s$ is labelled with an ID using a state different from $q'_0$; and (2) there is some $1 \leq j \leq |w|$ such that there is exactly one path of nodes $\pi_f, \pi'_1, \pi'_2, \ldots$ labelled with $(q_m, w, j), (q_m, w, j + 1), \ldots, (q_m, w, |w| + 1)$. Property (1) represents the automaton searching for its starting point, and (2) represents the end of the computation of the part of $A'_R$ that is representing the non-nesting part of $R$. We call such nodes $\pi_s$ and $\pi_f$ the tacit start and tacit ending of $\Pi$. With these definitions we can show the following.

Lemma 3.2 Let $S$ be a graph over $\Sigma$ that is a semipath, $w$ the label of the path $S$, and $R$ an NRE. Then a pair $(u_i, u_j)$ belongs to $[\![R]\!]_S$ if and only if there is an accepting computation tree of $A'_R$ on input $w$ whose tacit start is labelled with $(q_0, w, i)$ and whose tacit ending is labelled with $(q_m, w, j)$.

Proof: Let $S$ be the semipath $u_1, a_1, u_2, a_2, \ldots, u_m, a_m, u_{m+1}$, and therefore $w = a_1 \cdots a_m$, and let $A'_R = (\Sigma, Q, U, q'_0, \delta', F)$ be constructed as explained above.

For the only if direction, assume that $[\![R]\!]_S$ contains the pair $(u_i, u_j)$, $1 \leq i, j \leq m + 1$. We prove the above statement by induction on $R$.

• If $R = a$, for some $a \in \Sigma$, and $(u_i, u_j) \in [\![R]\!]_S$, then either $j = i + 1$ and the edge $(u_i, a, u_j)$ is in $S$, or $j = i - 1$ and the edge $(u_j, a^-, u_i)$ is in $S$. In the former case the existence of a computation tree is obvious; for the latter case observe that one could use the transition $(q_0, w, i) \Rightarrow (q_r, w, i - 1)$, and then, since $(u_j, a^-, u_i)$ is in $S$, we follow transition $(q_r, w, i - 1) \Rightarrow (q_f, w, i - 1)$.

• The case $R = a^-$ is analogous to the previous one.

• If $R = R_1 + R_2$ and $(u_i, u_j) \in [\![R]\!]_S$, then $(u_i, u_j) \in [\![R_k]\!]_S$ for $k = 1$ or $k = 2$, which entails a proper accepting computation tree for $A_{R_1}$ ($A_{R_2}$) on input $w$. The statement follows immediately from the construction of $A_R$.



• If $R = R_1 \cdot R_2$ and $(u_i, u_j) \in [\![R]\!]_S$, then there is a node $u_k$ of $S$ such that $(u_i, u_k) \in [\![R_1]\!]_S$ and $(u_k, u_j) \in [\![R_2]\!]_S$. Assume that the initial and marked states of $A_{R_1}$ and $A_{R_2}$ are $q_0^1, q_m^1$ and $q_0^2, q_m^2$, respectively. From the induction hypothesis we have that there are accepting computation trees for $A_{R_1}$ and $A_{R_2}$ whose tacit starts are $(q_0^1, w, i)$ and $(q_0^2, w, k)$, respectively, and the tacit ending of the first tree is labelled with $(q_m^1, w, k)$. Since $A'_R$ has, by construction, the pair $(q_0^2, 0)$ in $\delta(q_m^1, \varepsilon)$, we can cut the first tree at its tacit ending and plug in the computation tree for $A_{R_2}$, starting from its tacit start, which proves the statement.

• The case when $R = R_1^*$ goes along the same lines as the concatenation, except this time we may have to plug in a greater number of computation trees.

• Finally, if $R = [R_1]$ and $(u_i, u_j) \in [\![R]\!]_S$, then $u_i = u_j$, and there is some $u_k$ such that $(u_i, u_k) \in [\![R_1]\!]_S$. Let $q_p$ be the universal state in $A'_R$ that is not in $A_{R_1}$. Then the only transitions associated to $q_p$ are $\delta(q_p, \varepsilon) = \{(q_0^1, 0), (q_f, 0)\}$, with $q_f$ being the only marked (final) state of $A_R$. Our accepting computation tree for $A'_R$ has a path from the root to the tacit start, then a node labelled $(q_p, w, i)$ with children $(q_0^1, w, i)$ and $(q_f, w, i)$, with the computation tree for $A_{R_1}$ (starting from its tacit start) plugged into the first of these children.

For the if direction, assume that there is an accepting computation tree of $A'_R$ on input $w$ whose tacit start is labelled with $(q_0, w, i)$ and whose tacit ending is labelled with $(q_m, w, j)$. We now prove that $(u_i, u_j)$ belongs to $[\![R]\!]_S$. The proof is again by induction.

• For the base case when $R = a$ (the proof for $R = a^-$ is analogous), there are two options for an accepting computation of $A_R$. Either it is of the form $(q_0, w, i) \Rightarrow (q_f, w, i + 1)$, in which case $j = i + 1$ and $a_i = a$, or it is of the form $(q_0, w, i) \Rightarrow (q_r, w, i - 1) \Rightarrow (q_f, w, i - 1)$, in which case $j = i - 1$ and $a_j = a^-$. In both cases we obtain that $(u_i, u_j) \in [\![R]\!]_S$.

• When $R = R_1 + R_2$, by the construction of $A_R$, any computation tree of $A_R$ can be pruned from its tacit start to obtain a computation tree for one of $A_{R_1}$ or $A_{R_2}$, from where the statement easily follows.

• When $R = R_1 \cdot R_2$, we can similarly obtain computation trees for $A_{R_1}$ and $A_{R_2}$, and then conclude that $(u_i, u_j)$ belongs to $[\![R]\!]_S$. The same holds when $R = R_1^*$, except in this case we obtain multiple computation trees for $R_1$.

• Finally, if $R = [R_1]$ and there is an accepting computation tree of $A_R$ on input $w$ whose tacit start is labelled with $(q_0, w, i)$ and whose tacit ending is labelled with $(q_m, w, j)$, from the construction of $A_R$ the top part of the computation tree is of the form $(q_0, w, i) \Rightarrow (q_p, w, i) \Rightarrow (q_0^1, w, i), (q_m, w, i)$, where $q_p$ is the only universal state of $A_R$ not in $A_{R_1}$, and $q_0^1$ is the initial state of $A_{R_1}$. Then the part of the computation tree that follows from node $(q_0^1, w, i)$ comprises a computation tree for $A_{R_1}$, i.e., there is a $u_k$ such that $(u_i, u_k) \in [\![R_1]\!]_S$. This entails that $(u_i, u_i) \in [\![R]\!]_S$.

□

Proof for containment. For our containment algorithm we need to be a little more careful, since for a word $w$ accepted by $A'_R$ it is not necessarily the case that $u_i$ and $u_j$ are the start and finish nodes of the semipath $S$. Thus, we have to distinguish the start/end of the word from the actual piece that is framed by nodes $u_i$ and $u_j$ in the semipath. In order to do that, we augment $\Sigma$ with two extra symbols $\mathsf{S}, \mathsf{E}$. Furthermore, if $A'_R = (\Sigma, Q, U, q'_0, \delta, F)$, and $q_m$ is the marked state of $A'_R$, we construct $A_R^{S,E} = (\Sigma \cup \{\mathsf{S}, \mathsf{E}\}, Q \cup \{q^S, q^E\}, U, q^S, \delta^{S,E}, (F \setminus \{q_m\}) \cup \{q^E\})$, where $\delta^{S,E}$ is defined as follows: for each state $q \in Q \setminus U$, we add the pair $(q, 1)$ to $\delta(q, \mathsf{S})$ and $\delta(q, \mathsf{E})$ if $q$ is not $q_0$ or $q_m$; we add the pair $(q^E, 1)$ to $\delta(q_m, \mathsf{E})$ and the pair $(q_0, 1)$ to $\delta(q^S, \mathsf{S})$; and we add the pair $(q^S, -1)$ to each $\delta(q^S, a)$ for $a \in \Sigma \cup \{\mathsf{E}\}$.

The intuition is the following. Let $R$ be an NRE and $A'_R$ be the A2FA constructed as above. Now assume that there is a semipath $S = u_1, a_1, u_2, \ldots, u_n, a_n, u_{n+1}$ with label $w = a_1 \cdots a_n$ and nodes $u_i, u_j$ such that $(u_i, u_j) \in [\![R]\!]_S$. By the above lemma, we have that there is a computation tree for $A'_R$ that tacitly starts in $(q_0, w, i)$ and tacitly ends in $(q_m, w, j)$. The idea of the symbols $\mathsf{S}$ and $\mathsf{E}$ is to specifically mark the tacit start and end of the piece $a_i, \ldots, a_{j-1}$ labelling the semipath between $u_i$ and $u_j$. Thus, in this case, $A_R^{S,E}$ accepts the word $a_1 \cdots a_{i-1} \mathsf{S} a_i \cdots a_{j-1} \mathsf{E} a_j \cdots a_n$. It uses intuitively the same computation tree mentioned before, except that now it moves backwards in state $q^S$ until symbol $\mathsf{S}$ is reached, then proceeds with the computation, and the marked branch now ends in $q^E$ instead of $q_m$ after checking that there is a symbol $\mathsf{E}$ after $a_{j-1}$. With this intuition, it is straightforward to show:

Lemma 3.3 Let $S = u_1, a_1, u_2, \ldots, u_n, a_n, u_{n+1}$ be a graph over $\Sigma$ that is a simple semipath, $w = a_1 \cdots a_n$ the label of the path $S$, and $R$ an NRE. Then a pair $(u_i, u_j)$ belongs to $[\![R]\!]_S$ if and only if $A_R^{S,E}$ accepts the word $a_1 \cdots a_{i-1} \mathsf{S} a_i \cdots a_{j-1} \mathsf{E} a_j \cdots a_n$.
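The marked word in the lemma is obtained from the label of the semipath by inserting the two markers at the positions of $u_i$ and $u_j$. A small sketch of that encoding follows; the function name `marked_word` and the representation of the markers as the strings `"S"` and `"E"` are assumptions made for illustration, and the sketch only covers the case $i \leq j$ in which the start marker precedes the end marker.

```python
def marked_word(word, i, j):
    """Return a_1 ... a_{i-1} S a_i ... a_{j-1} E a_j ... a_n as a list.

    `word` is the list of labels a_1 ... a_n; positions i and j are 1-based
    node indices with 1 <= i <= j <= n + 1 (an assumption of this sketch).
    """
    if not 1 <= i <= j <= len(word) + 1:
        raise ValueError("expected 1 <= i <= j <= len(word) + 1")
    word = list(word)
    return word[:i - 1] + ["S"] + word[i - 1:j - 1] + ["E"] + word[j - 1:]
```

For example, for the label `a b c` and the pair $(u_2, u_3)$, the marked word is `a S b E c`.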

We can now state our algorithm for solving SP-NREContainment. On input NREs $R_1$ and $R_2$, we perform the following operations:

1. Compute an NFA $A^{S,E}$ that accepts only those words over $(\Sigma \cup \{\mathsf{S}, \mathsf{E}\})^*$ of the form $w_1 \mathsf{S} w_2 \mathsf{E} w_3$, for $w_1, w_2, w_3$ in $\Sigma^*$.

2. Compute $A_{R_1}^{S,E}$ and $A_{R_2}^{S,E}$ as explained above.

3. Compute the A2FA $A^c = (A_{R_2}^{S,E})^c$ whose language is the complement of the language of $A_{R_2}^{S,E}$.

4. Compute the A2FA $A$ whose language is the intersection of the languages of $A^{S,E}$, $A_{R_1}^{S,E}$ and $A^c$.

5. Check that the language of $A$ is empty.
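For experimenting with small instances, a bounded brute-force search for counterexamples can serve as a sanity check. The sketch below is explicitly not the automata-based PSPACE procedure described in the steps above: it enumerates all simple semipaths up to a given length and compares the two NREs directly. The tuple encoding of NREs, the `"-"` suffix for reverse symbols, and all function names are assumptions made for illustration.

```python
from itertools import product


def compose(r, s):
    """Composition of binary relations."""
    by_source = {}
    for w, v in s:
        by_source.setdefault(w, set()).add(v)
    return {(u, v) for u, w in r for v in by_source.get(w, ())}


def evaluate(nre, nodes, edges):
    """Evaluate an NRE given as a nested tuple over the graph (nodes, edges)."""
    kind = nre[0]
    identity = {(u, u) for u in nodes}
    if kind == "eps":
        return identity
    if kind == "sym":
        return {(u, v) for (u, a, v) in edges if a == nre[1]}
    if kind == "inv":
        return {(v, u) for (u, a, v) in edges if a == nre[1]}
    if kind == "cat":
        return compose(evaluate(nre[1], nodes, edges), evaluate(nre[2], nodes, edges))
    if kind == "alt":
        return evaluate(nre[1], nodes, edges) | evaluate(nre[2], nodes, edges)
    if kind == "star":
        base, result = evaluate(nre[1], nodes, edges), identity
        while True:
            bigger = result | compose(result, base)
            if bigger == result:
                return result
            result = bigger
    if kind == "nest":
        return {(u, u) for (u, _v) in evaluate(nre[1], nodes, edges)}
    raise ValueError(f"unknown NRE constructor: {kind}")


def semipath_graph(word):
    """Graph resembling the simple semipath labelled by `word` (nodes 1..m+1)."""
    nodes, edges = set(range(1, len(word) + 2)), set()
    for i, label in enumerate(word, start=1):
        if label.endswith("-"):
            edges.add((i + 1, label[:-1], i))   # reverse symbol: edge points back
        else:
            edges.add((i, label, i + 1))
    return nodes, edges


def find_counterexample(r1, r2, base_alphabet, max_len):
    """Search semipaths of length <= max_len for a pair in r1 but not in r2.

    Returns (word, pair) for the first counterexample found, or None.
    A result of None does NOT prove containment; it only bounds the search.
    """
    labels = list(base_alphabet) + [a + "-" for a in base_alphabet]
    for m in range(max_len + 1):
        for word in product(labels, repeat=m):
            nodes, edges = semipath_graph(word)
            extra = evaluate(r1, nodes, edges) - evaluate(r2, nodes, edges)
            if extra:
                return list(word), min(extra)
    return None
```

The search is exponential in `max_len`, so it is only useful for falsifying a conjectured containment on short semipaths.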

We have seen how to perform the second step in polynomial time, and steps (1), (3), (4) can be easily performed in PTIME using standard techniques from automata theory. Finally, Proposition 3.1 shows that step (5) can be performed in PSPACE. Thus, all that is left to prove is that the language of the resulting automaton $A$ is empty if and only if $R_1 \subseteq R_2$.

Assume first that $R_1 \subseteq R_2$, and assume for the sake of contradiction that there is a word $w \in L(A)$. We have that $w$ must be of the form $a_1 \cdots a_{i-1} \mathsf{S} a_i \cdots a_{j-1} \mathsf{E} a_j \cdots a_n$, and $w$ is accepted by $A_{R_1}^{S,E}$, but not by $A_{R_2}^{S,E}$. Let $S'$ be a graph consisting of the semipath $u_1, a_1, u_2, \ldots, u_n, a_n, u_{n+1}$. By Lemma 3.3, $(u_i, u_j) \in [\![R_1]\!]_{S'}$, and thus by our assumption $(u_i, u_j)$ must belong to $[\![R_2]\!]_{S'}$, but this would imply, again by the lemma, that $w$ is accepted by $A_{R_2}^{S,E}$.

On the other hand, if $L(A)$ is empty but $R_1 \not\subseteq R_2$, then for some graph $S = u_1, a_1, u_2, \ldots, u_n, a_n, u_{n+1}$ that is a semipath and nodes $u_i, u_j$ it is the case that $(u_i, u_j) \in [\![R_1]\!]_S$, yet $(u_i, u_j) \notin [\![R_2]\!]_S$. By Lemma 3.3 we have that $w = a_1 \cdots a_{i-1} \mathsf{S} a_i \cdots a_{j-1} \mathsf{E} a_j \cdots a_n$ is accepted by $A_{R_1}^{S,E}$, and it is not accepted by $A_{R_2}^{S,E}$, thus belonging to the language of $A^c$. Since clearly $w$ is also in the language of $A^{S,E}$, this means that $w$ belongs to $L(A)$, which is a contradiction.